Taking turns in general sum Markov games
Authors
Abstract
This paper provides a novel approach to multi-agent coordination in general-sum Markov games. Contrary to what is common in multi-agent learning, our approach does not focus on reaching a particular equilibrium between agent policies. Instead, it learns a basis set of special joint agent policies, over which it can randomize to build different solutions. The main idea is to tackle a Markov game by decomposing it into a set of multi-agent common interest problems, each reflecting one agent's preferences in the system. With only a minimum of coordination, simple reinforcement learning agents using Parameterised Learning Automata are able to solve this set of common interest problems in parallel. As a result, a team of simple learning agents becomes able to switch play between desired joint policies rather than mixing individual policies.
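The decomposition described in the abstract can be illustrated with a minimal sketch on a single-state game. Everything here is an assumption made for illustration: the payoff table (a Battle of the Sexes game), the function names, and the use of exhaustive search as a stand-in for the Parameterised Learning Automata that the paper uses. It is not the authors' algorithm, only a toy instance of "one common interest problem per agent, then switch between the resulting joint policies".

```python
# Hypothetical 2-agent, single-state game (Battle of the Sexes payoffs).
# REWARDS[joint_action] = (reward to agent 0, reward to agent 1)
REWARDS = {
    (0, 0): (2.0, 1.0),
    (0, 1): (0.0, 0.0),
    (1, 0): (0.0, 0.0),
    (1, 1): (1.0, 2.0),
}
N_AGENTS = 2


def common_interest_game(agent):
    """Build the game in which every agent receives `agent`'s reward."""
    return {joint: rewards[agent] for joint, rewards in REWARDS.items()}


def best_joint_action(game):
    """Stand-in for learning: exhaustive search for the best joint action.

    The paper's agents would learn this with learning automata instead.
    """
    return max(game, key=game.get)


def basis_policies():
    """One preferred joint action per agent: the basis set to switch between."""
    return [best_joint_action(common_interest_game(i)) for i in range(N_AGENTS)]


def take_turns(n_rounds):
    """Alternate between the basis joint actions; return average rewards."""
    basis = basis_policies()
    totals = [0.0] * N_AGENTS
    for t in range(n_rounds):
        joint = basis[t % N_AGENTS]  # round-robin switch of joint policy
        for i in range(N_AGENTS):
            totals[i] += REWARDS[joint][i]
    return [total / n_rounds for total in totals]


if __name__ == "__main__":
    print(basis_policies())
    print(take_turns(4))
```

In this toy game, alternating between the two basis joint actions gives each agent an average reward of 1.5, whereas both agents independently mixing their two actions uniformly would give each an expected reward of 0.75. This is the contrast between switching joint policies and mixing individual policies that the abstract draws.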
Similar resources
A Study of Gradient Descent Schemes for General-Sum Stochastic Games
Zero-sum stochastic games are easy to solve as they can be cast as simple Markov decision processes. This is however not the case with general-sum stochastic games. A fairly general optimization problem formulation is available for general-sum stochastic games by Filar and Vrieze [2004]. However, the optimization problem there has a non-linear objective and non-linear constraints with special s...
Cyclic Equilibria in Markov Games
Although variants of value iteration have been proposed for finding Nash or correlated equilibria in general-sum Markov games, these variants have not been shown to be effective in general. In this paper, we demonstrate by construction that existing variants of value iteration cannot find stationary equilibrium policies in arbitrary general-sum Markov games. Instead, we propose an alternative i...
Learning in Markov Games with Incomplete Information
The Markov game (also called stochastic game (Filar & Vrieze 1997)) has been adopted as a theoretical framework for multiagent reinforcement learning (Littman 1994). In a Markov game, there are n agents, each facing a Markov decision process (MDP). All agents’ MDPs are correlated through their reward functions and the state transition function. As Markov decision process provides a theoretical ...
Learning Nash Equilibrium for General-Sum Markov Games from Batch Data
This paper addresses the problem of learning a Nash equilibrium in γ-discounted multiplayer general-sum Markov Games (MGs) in a batch setting. As the number of players increases in MG, the agents may either collaborate or team apart to increase their final rewards. One solution to address this problem is to look for a Nash equilibrium. Although several techniques were found for the subcase of ...
Nonzero-sum Risk-sensitive Stochastic Games on a Countable State Space
The infinite horizon risk-sensitive discounted-cost and ergodic-cost nonzero-sum stochastic games for controlled Markov chains with countably many states are analyzed. For the discounted-cost game, we prove the existence of Nash equilibrium strategies in the class of Markov strategies under fairly general conditions. Under an additional geometric ergodicity condition and a small cost criterion,...